
Classifying a Data Set

A logged-in user with the Read/Write or Admin Entitlement can classify a Data Set from either of the following:

  1. The Data Set details screen by clicking on the ‘Classify Set’ button in the common header.

  2. The Data Set List screen by clicking on the ‘Classify Set’ button.

Step 1: Clicking the ‘Classify Set’ button takes the user to the list of Catalogs they have access to. The user selects the Catalog whose Semantic Objects they want to classify the Data Set with.

Step 2: Once a Catalog is selected, the Next Step button is enabled and the user can proceed to the next page.

Step 3: This screen lists all Semantic Objects in the selected Catalog. The user selects the Semantic Object to classify the Data Set with and proceeds to the Next Step. More than one Semantic Object can be selected.

Step 4: The user is then taken to a mapping page with the Data Set attributes in the left panel and the concepts, grouped by Semantic Object, in the right panel. Any existing mappings between the Data Set and the concepts of the Semantic Objects selected in Step 3 are shown here.

On this page, the user can do two things:

  • Remove any incorrect mappings. For example, the user can remove a mapping of PID to phone no. if it seems incorrect.

  • Drag additional Data Set columns from the left panel and drop them on Concepts in the right panel to map them, if required.
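The two mapping actions above can be pictured as edits to a set of (column, Semantic Object, concept) triples. The following is only an illustrative sketch in Python, not the product's API: the `Mapping` and `MappingPage` names, the "Contact" Semantic Object, and the "MOBILE" column are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Mapping:
    """One mapping shown on the mapping page (hypothetical model)."""
    attribute: str        # Data Set column, e.g. "PID"
    semantic_object: str  # Semantic Object selected in Step 3
    concept: str          # concept within that Semantic Object


@dataclass
class MappingPage:
    """Holds the mappings currently shown for the Data Set (hypothetical)."""
    mappings: set = field(default_factory=set)

    def remove(self, mapping: Mapping) -> None:
        # Corresponds to removing an incorrect mapping.
        self.mappings.discard(mapping)

    def add(self, attribute: str, semantic_object: str, concept: str) -> Mapping:
        # Corresponds to dropping a column onto a concept.
        mapping = Mapping(attribute, semantic_object, concept)
        self.mappings.add(mapping)
        return mapping


# Existing mapping that looks wrong: PID mapped to a phone number concept.
wrong = Mapping("PID", "Contact", "Phone No.")
page = MappingPage({wrong})

page.remove(wrong)                          # remove the incorrect mapping
page.add("MOBILE", "Contact", "Phone No.")  # map an additional column
```

After these two edits, the page holds a single mapping, from the hypothetical MOBILE column to the phone number concept.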

Once the mappings have been reviewed, clicking the Run Model button triggers the Classification Job, which takes these manual mappings into account and generates further mappings using Machine Learning. The Job may take a little while because it runs for the complete Tenant so that new changes in the model are handled automatically. Data Quality rules are also run to pick up any changes caused by mapping changes during the model run.
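As a rough mental model of how manual and machine-generated mappings might combine, the sketch below merges two dictionaries of column-to-concept mappings. It assumes that a manual mapping is kept whenever a generated one disagrees with it. That precedence rule, the `combine_mappings` name, and the sample columns are assumptions made for illustration and do not describe the actual Classification Job.

```python
def combine_mappings(manual: dict, suggested: dict) -> dict:
    """Merge manual and generated mappings (hypothetical helper).

    Assumption: manual mappings take precedence over generated ones
    for the same column.
    """
    combined = dict(suggested)  # start from the generated mappings
    combined.update(manual)     # manual entries overwrite on conflict
    return combined


# Hypothetical inputs: column name -> concept name.
manual = {"MOBILE": "Phone No."}
suggested = {"MOBILE": "Fax No.", "EMAIL_ADDR": "Email"}

result = combine_mappings(manual, suggested)
```

Under that assumption, the manual MOBILE mapping is kept and the generated EMAIL_ADDR mapping is added alongside it.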